Automated Arabic Antonym Extraction Using a Corpus Analysis Tool

نویسندگان

  • LULUH ALDHUBAYI
  • MAHA ALYAHYA
چکیده

The automatic extraction of semantic relations between words from textual corpora is an extremely challenging task. The increasing need for language resources supporting Natural language processing (NLP) applications has encouraged the development of automated methods for the extraction of semantic relations between words. The use of corpus statistical and similarity distribution methods can help in the task of semantic relation extraction between pairs of words. In this paper, we present a pattern-based bootstrapping approach using Arabic language corpora and a corpus analysis tool (Sketch Engine) to extract the semantic relations (antonyms) between word pairs. The algorithm uses LogDice and pattern cooccurrence to classify the extracted pairs into antonyms. Results of evaluation show that our approach is able to extract the antonym relations with a precision of 76%.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Positive Negative Neutral Sentiment Analysis Using Dual Sentiment Analysis

Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good). Then begin by ident...

متن کامل

Arabic Entity Graph Extraction Using Morphology, Finite State Machines, and Graph Transformations

Research on automatic recognition of named entities from Arabic text uses techniques that work well for the Latin based languages such as local grammars, statistical learning models, pattern matching, and rule-based techniques. These techniques boost their results by using application specific corpora, parallel language corpora, and morphological stemming analysis. We propose a method for extra...

متن کامل

Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents

Besides for its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have overtaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...

متن کامل

Identifying Metaphoric Antonyms in a Corpus Analysis of Finance Articles

Using a corpus of 17,000+ financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP and DOWN-verbs used to describe movements of indices, stocks and shares. In Study 1 people identified antonyms of these verb sets in a free-generation task and a match-theopposite task and the most commonly identified antonyms were compiled. In Study 2, we ...

متن کامل

Multi-Level Analysis and Annotation of Arabic Corpora for Text-to-Sign Language MT

The Arabic language is morphologically rich and syntactically complex with many differences from European languages, and this creates a challenge when porting existing annotation tools to Arabic. In this paper, we present an ongoing effort in lexical semantic analysis and annotation of Modern Standard Arabic (MSA) text, a semi automatic annotation tool concerned with the morphologic, syntactic,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014